The Best 190 Video Processing Tools in 2025
Timesformer Base Finetuned K400
TimeSformer is a video classification model pre-trained on the Kinetics-400 dataset, utilizing a spatiotemporal attention mechanism for video understanding.
Video Processing
Transformers

facebook
108.61k
33
Vivit B 16x2 Kinetics400
MIT
ViViT is an extension of the Vision Transformer (ViT) for video processing, particularly suitable for video classification tasks.
Video Processing
Transformers

google
56.94k
32
Animatediff Motion Lora Zoom In
Motion LoRAs can add specific types of motion effects to animations, such as zooming, panning, tilting, and rotation.
Video Processing
guoyww
51.43k
8
Videomae Base
VideoMAE is a video self-supervised pretraining model based on Masked Autoencoder (MAE), which learns internal video representations by predicting pixel values of masked video patches.
Video Processing
Transformers

MCG-NJU
48.66k
45
Dfot
MIT
A novel video diffusion model capable of generating high-quality videos from any number of context frames
Video Processing
kiwhansong
47.19k
6
Videomae Base Finetuned Kinetics
VideoMAE is a video self-supervised pre-training model based on Masked Autoencoder (MAE), fine-tuned on the Kinetics-400 dataset for video classification tasks.
Video Processing
Transformers

MCG-NJU
44.91k
34
Mochi 1 Preview
Apache-2.0
A high-fidelity video generation model developed by Genmo, featuring exceptional motion expressiveness and precise prompt adherence
Video Processing English
genmo
27.13k
1,216
Animatediff Motion Lora Zoom Out
Motion LoRAs can add specific types of motion effects to animations
Video Processing
guoyww
11.43k
5
Ppo SpaceInvadersNoFrameskip V4
This is a reinforcement learning agent based on the PPO algorithm, specifically designed for training and gameplay in the SpaceInvadersNoFrameskip-v4 game environment.
Video Processing
sb3
8,999
0
Stable Video Diffusion Img2vid Xt 1 1
Other
Stable Video Diffusion (SVD) 1.1 is a diffusion-based image-to-video model that generates short video clips from a static image used as the conditioning frame.
Video Processing
vdo
8,560
28
Videomaev2 Large
VideoMAEv2-Large is a large-scale video feature extraction model pre-trained with self-supervision on the UnlabeledHybrid-1M dataset
Video Processing
OpenGVLab
5,581
1
Animatediff Motion Lora Pan Left
Motion LoRAs can add specific types of motion effects to your animations
Video Processing
guoyww
5,209
2
Animatediff Motion Lora Tilt Down
A Motion LoRA model for adding specific types of motion effects to text-to-video animations
Video Processing
guoyww
5,091
4
Wan2.1 FLF2V 14B 720P Gguf
Apache-2.0
Wan2.1-FLF2V-14B-720P is a video generation model that supports generating videos from images, suitable for various video creation scenarios.
Video Processing Supports Multiple Languages
city96
5,019
17
Animatediff Motion Lora Pan Right
Motion LoRAs can add specific types of motion effects to animations, such as zoom in/out, panning, tilting, and rotation.
Video Processing
guoyww
4,923
2
Videomae Large Finetuned Kinetics
VideoMAE is a self-supervised video pre-training model based on masked autoencoder, fine-tuned on the Kinetics-400 dataset for video classification tasks.
Video Processing
Transformers

MCG-NJU
4,657
12
Timesformer Base Finetuned K600
TimeSformer is a video classification model pretrained on the Kinetics-600 dataset, utilizing a spatiotemporal attention mechanism to process video data.
Video Processing
Transformers

facebook
4,026
12
Videomaev2 Base
VideoMAEv2-Base is a self-supervised video feature extraction model that employs a dual masking mechanism and is pre-trained on the UnlabeledHybrid-1M dataset.
Video Processing
OpenGVLab
3,565
5
Moviigen1.1 GGUF
Apache-2.0
A GGUF-format conversion of the MoviiGen1.1 video generation model, supporting text-to-video tasks.
Video Processing
wsbagnsv1
3,522
18
Videomae Large
VideoMAE is a video self-supervised pre-training model based on Masked Autoencoder (MAE), which learns video representations by predicting pixel values of masked video patches
Video Processing
Transformers

MCG-NJU
3,243
31
Videomae Huge Finetuned Kinetics
VideoMAE is a video model based on Masked Autoencoder (MAE), pretrained with self-supervision and then fine-tuned on the Kinetics-400 dataset, suitable for video classification tasks.
Video Processing
Transformers

MCG-NJU
2,984
4
Timesformer Hr Finetuned K600
TimeSformer is a video classification model based on spatio-temporal attention mechanisms, specifically designed for video understanding tasks.
Video Processing
Transformers

facebook
2,927
6
Liveportrait
MIT
LivePortrait is an efficient portrait animation model that converts a static image into an animated video using stitching and retargeting control
Video Processing
KwaiVGI
2,495
389
Videomae Small Finetuned Kinetics
VideoMAE is a masked autoencoder model for video, pretrained with self-supervision and fine-tuned on the Kinetics-400 dataset, suitable for video classification tasks.
Video Processing
Transformers

MCG-NJU
2,152
1
Cakeify
Apache-2.0
A LoRA trained on the Wan2.1 14B I2V 480p model, capable of transforming any object in an image into a cake-effect video
Video Processing English
Remade-AI
1,955
16
Vivit B 16x2 Kinetics400 Finetuned Cctv Surveillance
MIT
A video action recognition model based on the ViViT architecture, fine-tuned specifically for CCTV surveillance scenarios, excelling in action recognition tasks.
Video Processing
Transformers

ratchy-oak
1,939
1
Inflate
Apache-2.0
A LoRA trained on the Wan2.1 14B I2V 480p model, which can convert static images into dynamic videos with an inflatable effect.
Video Processing English
Remade-AI
1,903
11
Animatediff Motion Lora Rolling Clockwise
AnimateDiff motion adapter model for adding specific motion effects to generated animations
Video Processing
guoyww
1,548
1
Animatediff Motion Lora V1 5 3
Motion LoRAs can add specific types of motion effects to animations, such as zoom in/out, panning, tilting, and rotation.
Video Processing
guoyww
1,438
4
Hyvid I2v Gguf
Other
An image-to-video model developed by Tencent Hunyuan Community, capable of converting input images into dynamic video content.
Video Processing English
calcuis
1,212
6
Videomaev2 Huge
VideoMAEv2-Huge is a self-supervised learning-based video feature extraction model, pre-trained for 1200 epochs on the UnlabeledHybrid-1M dataset.
Video Processing
OpenGVLab
1,145
1
Animatediff Motion Lora Rolling Anticlockwise
A Motion LoRA model for adding specific types of motion effects to text-generated animations
Video Processing
guoyww
1,129
1
Videomaev2 Giant
VideoMAEv2-giant is an ultra-large-scale video classification model based on self-supervised learning, employing a dual masking strategy for pretraining.
Video Processing
OpenGVLab
1,071
4
Vivit B 16x2
MIT
ViViT is an extension of the Vision Transformer (ViT) for video processing, primarily used for downstream tasks such as video classification.
Video Processing
Transformers

google
989
11
Videomae Base Finetuned Ssv2
VideoMAE is a video self-supervised pretraining model based on Masked Autoencoder (MAE), fine-tuned on the Something-Something-v2 dataset for video classification tasks.
Video Processing
Transformers

MCG-NJU
951
6
Skyreels V2 I2V 14B 540P GGUF
Other
SkyReels-V2-I2V-14B-540P is a GGUF-format converted image-to-video model that supports generating dynamic video content from static images.
Video Processing
wsbagnsv1
929
8
Videomae Base Short
VideoMAE is a video self-supervised pretraining model based on Masked Autoencoder (MAE), which learns internal video representations through masked patch prediction, suitable for downstream tasks like video classification.
Video Processing
Transformers

MCG-NJU
886
3
Animatediff Motion Adapter V1 5 3
AnimateDiff is a technology that leverages existing Stable Diffusion text-to-image models to create videos by inserting motion module layers to achieve coherent motion between image frames.
Video Processing
guoyww
800
8
Skyreels V2 I2V 14B 720P GGUF
Other
SkyReels-V2-I2V-14B-720P is an image-to-video generation model capable of converting static images into dynamic videos.
Video Processing
wsbagnsv1
724
4
Kissing
Apache-2.0
LoRA trained on the Wan2.1 14B I2V 480p model for generating kissing interaction videos from images
Video Processing English
Remade-AI
686
7
Stable Video Diffusion Img2vid Xt 1 1
Other
A latent diffusion model for generating short video clips from static images, supporting 25-frame video generation at 1024x576 resolution
Video Processing
weights
682
6
Animatediff Motion Adapter V1 5
AnimateDiff is a technology that enables existing Stable Diffusion text-to-image models to generate videos by inserting motion module layers to achieve coherent motion between frames.
Video Processing
guoyww
649
3